Search | VHL Regional Portal

Predicting lncRNA-protein interactions with bipartite graph embedding and deep graph neural networks.

Ma, Yuzhou; Zhang, Han; Jin, Chen; Kang, Chuanze.

Front Genet ; 14: 1136672, 2023.

Article in English | MEDLINE | ID: mdl-36845380

ABSTRACT

Background: Long non-coding RNAs (lncRNAs) play crucial roles in numerous biological processes. Investigating lncRNA-protein interactions helps uncover previously undetected molecular functions of lncRNAs. In recent years, computational approaches have increasingly replaced the time-consuming traditional experiments used to uncover unknown associations. However, the heterogeneity in association prediction between lncRNAs and proteins has not been adequately explored, and it remains challenging to integrate the heterogeneity of lncRNA-protein interactions with graph neural network (GNN) algorithms. Methods: In this paper, we construct a deep GNN-based architecture called BiHo-GNN, which is the first to integrate the properties of homogeneous and heterogeneous networks through bipartite graph embedding. Unlike previous research, BiHo-GNN can capture the mechanism of molecular association through the data encoder of heterogeneous networks. We also design a process of mutual optimization between homogeneous and heterogeneous networks, which improves the robustness of BiHo-GNN. Results: We collected four datasets for predicting lncRNA-protein interactions and compared the performance of current prediction models on the benchmarking dataset. BiHo-GNN outperforms existing bipartite graph-based methods. Conclusion: BiHo-GNN integrates the bipartite graph with homogeneous graph networks. Based on this model structure, lncRNA-protein interactions and potential associations can be predicted and discovered accurately.
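The abstract does not say how BiHo-GNN derives its homogeneous networks from the bipartite lncRNA-protein graph. As a rough illustration only, the sketch below uses a standard one-mode projection, in which two lncRNAs are linked when they share a protein partner. The function name `project_bipartite` and the toy identifiers are hypothetical, and the projection is an assumption, not the authors' method.

```python
from collections import defaultdict
from itertools import combinations


def project_bipartite(pairs):
    """Build a bipartite adjacency and a lncRNA-lncRNA projection (assumed scheme)."""
    lnc_to_protein = defaultdict(set)
    protein_to_lnc = defaultdict(set)
    for lnc, prot in pairs:
        lnc_to_protein[lnc].add(prot)
        protein_to_lnc[prot].add(lnc)

    # Homogeneous view: two lncRNAs are connected when they share a protein;
    # the edge weight counts how many proteins they share.
    weights = defaultdict(int)
    for lncs in protein_to_lnc.values():
        for a, b in combinations(sorted(lncs), 2):
            weights[(a, b)] += 1
    return dict(lnc_to_protein), dict(weights)


# Toy interaction list (hypothetical identifiers, not real data).
pairs = [("lnc1", "P1"), ("lnc2", "P1"), ("lnc2", "P2"),
         ("lnc3", "P2"), ("lnc1", "P2")]
bipartite, homogeneous = project_bipartite(pairs)
print(homogeneous)
```

A model could then embed the bipartite graph and the projected graph separately and exchange information between them, which is roughly the kind of interplay the abstract calls mutual optimization.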

TLCrys: Transfer Learning Based Method for Protein Crystallization Prediction.

Jin, Chen; Shi, Zhuangwei; Kang, Chuanze; Lin, Ken; Zhang, Han.

Int J Mol Sci ; 23(2), 2022 Jan 16.

Article in English | MEDLINE | ID: mdl-35055158

ABSTRACT

X-ray diffraction is one of the most common methods of determining protein structures, yet only 2-10% of proteins can produce diffraction-quality crystals. Several computational methods have been proposed to predict protein crystallization. Nevertheless, the current state-of-the-art computational methods are limited by the scarcity of experimental data, so the prediction accuracy of existing models has not reached the ideal level. To address these problems, we propose a novel transfer-learning-based framework for protein crystallization prediction, named TLCrys. The framework proceeds in two steps: pre-training and fine-tuning. The pre-training step adopts an attention mechanism to extract both global and local information from the protein sequences. The representation learned in the pre-training step is regarded as knowledge to be transferred and fine-tuned to enhance the performance of crystallization prediction. During pre-training, TLCrys adopts a multi-task learning method, which not only improves the learning ability of protein encoding but also enhances the robustness and generalization of the protein representation. The multi-head self-attention layer ensures that different levels of the protein representation can be extracted by the fine-tuning step. During transfer learning, the fine-tuning strategy used by TLCrys improves the task-specific learning ability of the network. Our method significantly outperforms all previous predictors in five crystallization stages of prediction. Furthermore, the proposed methodology generalizes well to other protein sequence classification tasks.
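To make the attention step concrete, here is a minimal sketch of scaled dot-product self-attention over a toy sequence of residue embeddings. It is a simplification under stated assumptions: a single head, with the input used directly as queries, keys and values. TLCrys itself is described as multi-head and would use learned projections. The names `softmax` and `self_attention` are illustrative.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Single-head self-attention with identity projections (Q = K = V = x)."""
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        # Each output is the attention-weighted average of all positions.
        outputs.append([sum(w[j] * x[j][c] for j in range(len(x)))
                        for c in range(d)])
    return outputs, weights


# Three toy "residue" embeddings of dimension 2 (made-up values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = self_attention(x)
```

Each row of `attn` sums to one, so every position mixes information from the whole sequence (global context) while still weighting similar positions more heavily.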

Subject(s)

Computational Biology/methods , Proteins/chemistry , Algorithms , Crystallization , Machine Learning

OTNet: A CNN Method Based on Hierarchical Attention Maps for Grading Arteriosclerosis of Fundus Images with Small Samples.

Bai, Hang; Gao, Li; Quan, Xiongwen; Zhang, Han; Gao, Shuo; Kang, Chuanze; Qi, Jiaqiang.

Interdiscip Sci ; 14(1): 182-195, 2022 Mar.

Article in English | MEDLINE | ID: mdl-34536209

ABSTRACT

The severity of fundus arteriosclerosis can be determined and divided into four grades according to fundus images. Automatic grading of fundus arteriosclerosis is helpful in clinical practice, so this paper proposes a convolutional neural network (CNN) method based on hierarchical attention maps to solve the automatic grading problem. First, we use a retinal vessel segmentation model to separate the important vascular region from the non-vascular background region of the fundus image and obtain two attention maps. The two maps are used as inputs to construct a two-stream CNN (TSNet), which focuses on feature information through mutual reference between the two regions. In addition, we use convex hull attention maps in a one-stream CNN (OSNet) to learn the valuable areas where the retinal vessels are concentrated. Then, we design an integrated OTNet model composed of TSNet, which learns image feature information, and OSNet, which learns discriminative areas. After obtaining the representation learning parts of the two networks, we train the classification layer to achieve better results. Our proposed TSNet reaches an AUC of 0.796 and an accuracy (ACC) of 0.592 on the testing set, and the integrated model OTNet reaches an AUC of 0.806 and an ACC of 0.606, which are better than the results of other comparable models. As far as we know, this is the first attempt to use deep learning to classify the severity of arteriosclerosis in fundus images. The prediction results of our proposed method were judged acceptable by doctors, which indicates that the method has practical application value.
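The convex hull attention map mentioned above can be illustrated with a small sketch: given a binary vessel mask, compute the convex hull of the vessel pixels and mark everything inside it. This is only an assumed reading of the idea. The abstract does not describe how OSNet builds or applies the map, and the helper names `convex_hull` and `hull_mask` are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; vertices come back counter-clockwise in (x, y)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_mask(vessel_mask):
    """Mark every pixel inside or on the convex hull of the vessel pixels."""
    h, w = len(vessel_mask), len(vessel_mask[0])
    pts = [(x, y) for y in range(h) for x in range(w) if vessel_mask[y][x]]
    hull = convex_hull(pts)
    n = len(hull)

    def inside(p):
        if n < 3:
            # Degenerate hull (point or line): keep only the vessel pixels.
            return p in pts
        return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

    return [[1 if inside((x, y)) else 0 for x in range(w)] for y in range(h)]


# Toy 5x5 "vessel" mask with four vessel pixels forming a diamond.
vessels = [[0] * 5 for _ in range(5)]
for x, y in [(0, 2), (2, 0), (4, 2), (2, 4)]:
    vessels[y][x] = 1
mask = hull_mask(vessels)
```

Multiplying an image by such a mask would keep the region where vessels are concentrated and suppress the rest, which is one plausible way to feed a one-stream network.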

Subject(s)

Algorithms , Arteriosclerosis , Arteriosclerosis/diagnostic imaging , Attention , Fundus Oculi , Humans , Image Processing, Computer-Assisted , Neural Networks, Computer

LR-GNN: a graph neural network based on link representation for predicting molecular associations.

Kang, Chuanze; Zhang, Han; Liu, Zhuo; Huang, Shenwei; Yin, Yanbin.

Brief Bioinform ; 23(1), 2022 01 17.

Article in English | MEDLINE | ID: mdl-34889446

ABSTRACT

In biomedical networks, molecular associations are important for understanding biological processes and functions. Many computational methods, such as link prediction methods based on graph neural networks (GNNs), have been successfully applied to discover molecular relationships with biological significance. However, it remains a challenge to develop a method that relies on representation learning of links to predict molecular associations accurately. In this paper, we present a novel GNN based on link representation (LR-GNN) to identify potential molecular associations. LR-GNN applies a graph convolutional network (GCN) encoder to obtain node embeddings. To represent associations between molecules, we design a propagation rule that captures the node embeddings of each GCN-encoder layer to construct the link representation (LR). Furthermore, the LRs of all layers are fused in the output by a designed layer-wise fusing rule, which enables LR-GNN to output more accurate results. Experiments on four biomedical network datasets, covering lncRNA-disease association, miRNA-disease association, protein-protein interaction and drug-drug interaction, show that LR-GNN outperforms state-of-the-art methods and achieves robust performance. Case studies on two datasets are also presented to verify the ability to predict unknown associations. Finally, we validate the effectiveness of the LR by visualization.
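The general shape of this pipeline can be sketched in a few lines: run a GCN encoder, keep the node embeddings from every layer, build a link representation per layer from the two endpoint embeddings, then fuse across layers. The abstract does not give the actual propagation or fusing rules, so the sketch assumes a Hadamard product per layer and a plain average across layers. The weights are fixed toy values rather than learned parameters, and all function names are illustrative.

```python
import math


def normalize_adj(adj):
    """Symmetric normalisation D^-1/2 (A + I) D^-1/2 used by standard GCNs."""
    n = len(adj)
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layers(adj, features, weights):
    """Return the node embeddings produced by every GCN layer (ReLU activation)."""
    a_hat = normalize_adj(adj)
    h, outputs = features, []
    for w in weights:
        h = [[max(0.0, v) for v in row] for row in matmul(matmul(a_hat, h), w)]
        outputs.append(h)
    return outputs


def link_representation(layer_embeddings, u, v):
    """Assumed rule: per-layer Hadamard product of endpoints, averaged over layers."""
    per_layer = [[a * b for a, b in zip(h[u], h[v])] for h in layer_embeddings]
    return [sum(vals) / len(per_layer) for vals in zip(*per_layer)]


# Toy path graph 0 - 1 - 2 with made-up features and weights.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[[0.5, -0.2], [0.1, 0.4]], [[0.3, 0.2], [-0.1, 0.6]]]
layers = gcn_layers(adj, features, weights)
lr_01 = link_representation(layers, 0, 1)
```

In a trained model the fused vector would be passed to a classifier that scores whether nodes 0 and 1 are associated. Averaging requires every layer to share the same width, which is a limitation of this sketch only.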

Subject(s)

Computational Biology/methods , Neural Networks, Computer , Algorithms , Biomedical Technology/methods , Cell Communication , Deep Learning , Drug Interactions , Humans , MicroRNAs , Protein Interaction Domains and Motifs , RNA, Long Noncoding , Research Design

Automatic arteriosclerotic retinopathy grading using four-channel with image merging.

Gao, Shuo; Gao, Li; Quan, Xiongwen; Zhang, Han; Bai, Hang; Kang, Chuanze.

Comput Methods Programs Biomed ; 208: 106274, 2021 Sep.

Article in English | MEDLINE | ID: mdl-34325376

ABSTRACT

BACKGROUND AND OBJECTIVE: Arteriosclerosis can reflect the severity of hypertension, which is one of the main diseases threatening human life. However, detecting arteriosclerotic retinopathy involves costly and time-consuming manual assessment. To meet the urgent need for automation, this paper develops a novel arteriosclerotic retinopathy grading method based on a convolutional neural network. METHODS: First, we propose a scheme for extracting features against the fundus blood vessel background, using image merging for contour enhancement. In this step, adaptive thresholding is applied to the original image to generate a new contour channel, which is merged with the original three-channel image. Then, we employ a pre-trained convolutional neural network with transfer learning to speed up training and initialize the parameters of the contour channel with Kaiming initialization. Moreover, ArcLoss is applied to increase inter-class differences and intra-class similarity, addressing the high similarity between images of different classes in the dataset. RESULTS: The accuracy of arteriosclerotic retinopathy grading achieved by the proposed method is up to 65.354%, which is nearly 4% higher than that of the existing methods. The Kappa of our method is 0.508 in arteriosclerotic retinopathy grading. CONCLUSIONS: An experimental study on multiple metrics demonstrates the superiority of our method, which will be a useful addition to the toolbox for arteriosclerotic retinopathy grading.
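The image-merging step can be sketched as follows: derive a binary contour channel by adaptive thresholding and append it to the RGB channels to obtain a four-channel input. The thresholding rule here is an assumption (a pixel is marked when it is darker than the mean of its local window), as are the window radius, the offset, and the function names. The abstract does not give the authors' exact settings.

```python
def adaptive_threshold(gray, radius=1, offset=0.0):
    """Mark a pixel 1 when it is darker than its local mean minus `offset`."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Local window clipped at the image border.
            window = [gray[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            if gray[y][x] < sum(window) / len(window) - offset:
                out[y][x] = 1
    return out


def merge_four_channel(rgb):
    """Append the contour channel to each RGB pixel, giving (R, G, B, C)."""
    gray = [[sum(px) / 3 for px in row] for row in rgb]
    contour = adaptive_threshold(gray)
    return [[(*px, c) for px, c in zip(row, crow)]
            for row, crow in zip(rgb, contour)]


# Toy 3x3 image: bright background with one dark "vessel" pixel in the centre.
image = [[(200, 200, 200)] * 3 for _ in range(3)]
image[1][1] = (20, 20, 20)
merged = merge_four_channel(image)
```

Because a pre-trained network expects three input channels, the weights for the extra channel have no pre-trained values, which is consistent with the abstract's use of Kaiming initialization for the contour channel parameters.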

Subject(s)

Arteriosclerosis , Retinal Diseases , Automation , Fundus Oculi , Humans , Image Processing, Computer-Assisted , Neural Networks, Computer , Retinal Diseases/diagnostic imaging

SGL-SVM: A novel method for tumor classification via support vector machine with sparse group Lasso.

Huo, Yanhao; Xin, Lihui; Kang, Chuanze; Wang, Minghui; Ma, Qin; Yu, Bin.

J Theor Biol ; 486: 110098, 2020 02 07.

Article in English | MEDLINE | ID: mdl-31786183

ABSTRACT

At present, with the in-depth study of gene expression data, the significant role of tumor classification in clinical medicine has become more apparent, in particular given the sparse characteristics of gene expression data within and between groups. Therefore, this paper focuses on tumor classification based on the sparsity characteristics of genes. On this basis, we propose a new method of tumor classification: Sparse Group Lasso (least absolute shrinkage and selection operator) and Support Vector Machine (SGL-SVM). Firstly, the primary selection of feature genes is performed on the normalized tumor datasets using the Kruskal-Wallis rank sum test. Secondly, a sparse group Lasso is used for further selection, and finally, the support vector machine serves as the classifier. We validate the proposed method on microarray and NGS datasets. First, it is tested by 10-fold cross-validation on three two-class and five multi-class microarray datasets and compared with three other classifiers. SGL-SVM is then applied to the BRCA and GBM datasets and tested by 5-fold cross-validation. These experiments yield satisfactory accuracy, which is compared with that of other proposed methods. The experimental results show that the proposed method achieves higher classification accuracy and selects fewer feature genes, and that it can be widely applied to the classification of high-dimensional and small-sample tumor datasets. The source code and all datasets are available at https://github.com/QUST-AIBBDRC/SGL-SVM/.
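The sparse group Lasso selects features at two levels, whole groups and individual genes within a group. One way to see this is through its proximal operator, which applies element-wise soft-thresholding followed by a group-wise shrinkage. The sketch below shows that operator for the penalty lam_l1 * ||b||_1 + lam_group * sum_g ||b_g||_2. Group-size weights are omitted for brevity, and the grouping of genes is a made-up toy. The repository linked in the abstract is the reference for the authors' actual implementation.

```python
import math


def soft_threshold(v, t):
    """Shrink v toward zero by t, setting it to zero if |v| <= t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def prox_sparse_group_lasso(beta, groups, lam_l1, lam_group):
    """Proximal operator of the sparse group Lasso penalty (unweighted groups)."""
    out = [0.0] * len(beta)
    for idx in groups:
        # Step 1: element-wise soft-thresholding gives within-group sparsity.
        s = [soft_threshold(beta[i], lam_l1) for i in idx]
        # Step 2: shrink the whole group; weak groups are zeroed out entirely.
        norm = math.sqrt(sum(v * v for v in s))
        scale = max(0.0, 1.0 - lam_group / norm) if norm > 0 else 0.0
        for i, v in zip(idx, s):
            out[i] = scale * v
    return out


# Four toy coefficients in two groups (hypothetical values).
beta = [3.0, -0.5, 0.2, 0.1]
groups = [[0, 1], [2, 3]]
shrunk = prox_sparse_group_lasso(beta, groups, lam_l1=0.5, lam_group=1.0)
```

In this toy case the second group is removed as a whole, while the first group survives with only its strongest coefficient, which mirrors the abstract's claim of selecting fewer feature genes.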

Subject(s)

Neoplasms , Support Vector Machine , Algorithms , Gene Expression Profiling , Humans , Microarray Analysis , Neoplasms/genetics , Software

Feature selection and tumor classification for microarray data using relaxed Lasso and generalized multi-class support vector machine.

Kang, Chuanze; Huo, Yanhao; Xin, Lihui; Tian, Baoguang; Yu, Bin.

J Theor Biol ; 463: 77-91, 2019 02 21.

Article in English | MEDLINE | ID: mdl-30537483

ABSTRACT

At present, the study of gene expression data provides a reference for tumor diagnosis at the molecular level. It is a challenging task to select the feature genes related to classification from high-dimensional and small-sample gene expression data and to successfully separate the different subtypes of tumor, or normal samples from patient samples. In this paper, we present a new method for tumor classification: relaxed Lasso (least absolute shrinkage and selection operator) and generalized multi-class support vector machine (rL-GenSVM). Firstly, the tumor datasets are z-score normalized. Secondly, relaxed Lasso is used to select feature gene sets on the training set, and finally, the generalized multi-class support vector machine (GenSVM) serves as the classifier. We select four two-class datasets and four multi-class datasets for experiments, and four classifiers are used to predict and compare the classification accuracy on the test set. For comparison with other proposed methods, we obtain satisfactory classification accuracy by 10-fold cross-validation on all samples of each dataset. The experimental results show that the method proposed in this paper selects fewer feature genes and achieves higher classification accuracy. rL-GenSVM uses regularization parameters to avoid overfitting and can be widely applied to the classification of high-dimensional and small-sample tumor data. The source code and all datasets are available at https://github.com/QUST-AIBBDRC/rL-GenSVM/.
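The first step of the pipeline, z-score normalization, can be shown in a few lines. The sketch standardises each gene (column) across samples using the population standard deviation. Whether the authors computed the statistics on all samples or only on the training set is not stated in the abstract, so that choice, along with the function name `zscore_columns`, is an assumption.

```python
import statistics


def zscore_columns(samples):
    """Standardise each gene (column) to zero mean and unit variance."""
    columns = list(zip(*samples))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    # Constant genes have zero spread; map them to 0.0 instead of dividing by 0.
    return [[(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
            for row in samples]


# Three toy samples with two "genes"; the second gene is constant.
expression = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
normalized = zscore_columns(expression)
```

Putting every gene on the same scale matters for the later Lasso step, because an L1 penalty treats all coefficients alike and would otherwise favour genes measured on larger scales.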

Subject(s)

Datasets as Topic , Neoplasms/classification , Oligonucleotide Array Sequence Analysis , Support Vector Machine , Gene Expression Profiling , Humans , Neoplasms/genetics , Software
